Task 1.

Image source: National Geographic

Image source: National Geographic

Snowshoe hares have been identified as a ‘keystone’ prey species in northern boreal forests. This study, conducted in the Bonanza Creek Experimental Forest just east of Fairbanks, Alaska, tracks snowshoe hare populations to generate quantitative data that could be used to better explain vegetation and preditor responses to fluctuations in hare populations. Snowshoe hares are known to experience population fluctuations every 8-11 years, as demonstrated by fluctuations in observation counts, with a low in 2002 and a high in 2010, 8 years later (Figure 1.).

Make sure that both your figure and table appear in your final knitted document, each with a useful caption. Include text associated with each to help the audience understand and interpret the results.

library(tidyverse)
library(kableExtra)
library(janitor)
library(lubridate)
library(tidyr)
library(ggfortify)
library(RColorBrewer)

hares<- read_csv("showshoe_lter.csv") 

hares_mf <- hares %>% 
  mutate(sex = str_to_lower(sex)) %>% # change all entries in sex column to lower case
  filter(sex == "m" | sex == "f") %>% # remove all observations with NA or ? in sex
  mutate(date = mdy(date)) %>% # change date format
  mutate(full_date = date) %>% # retain date column
  separate(date, into = c("year", "month", "day")) %>% #create new columns with parsed date info
  mutate(month = as.numeric(month)) %>% 
  mutate(season = ifelse(month %in% c(1:4, 11, 12), "winter", "summer")) %>% # organize months into "summer" and "winter"
  mutate(year = as.numeric(year))
  
# seems like there are a lot more observations in the summer than in the winter so maybe not an interesting thing to use for visualization

# explore graph
ggplot(data = hares_mf, aes(x = weight, y = hindft))+
  geom_point()+
  facet_wrap(~sex)

# this is not that interesting to me
# try something else that may be more interesting

hares_hist <- ggplot(data = hares_mf, aes(x = year))+
  geom_bar(aes(fill = sex), position = "dodge")+
  scale_fill_manual(values = c("peru", "peachpuff2"), labels = c("Female", "Male"))+
  scale_x_continuous(expand = c(0,0),
                     breaks = seq(1998, 2012, 1))+
  theme_minimal()+
  labs(x = "Year of Observation",
       y = "Observation Count")+
  theme(legend.title = element_blank())

hares_hist

Figure 1. Snowshoe Hare Monitoring Observations in the Bonanza Creek Experimental Forest Between 1998 - 2012.
Total number of observations of female and male snowshoe hares are recorded each year during the monitoring period. Data source: Kielland K., F. S. Chapin, R. W. Ruess. 2017.

# make a summary table for females
hare_summary_f <- hares_mf %>% 
  filter(sex == "f") %>% 
  drop_na(weight, hindft) %>% # take out an values to be able to calculate mean
  group_by(year) %>% # organize by year and sex
  summarize(
    count = length(sex), # number of observations
    avg_weight = round(mean(weight), digits = 0))# mean weight
    
# make a summary table for males
hare_summary_m <- hares_mf %>% 
  filter(sex == "m") %>% 
  drop_na(weight, hindft) %>% # take out an values to be able to calculate mean
  group_by(year) %>% # organize by year and sex
  summarize(
    count = length(sex), # number of observations
    avg_weight = round(mean(weight), digits = 0)) # mean weight

# combine the two tables!
hare_summary <- merge(hare_summary_f, hare_summary_m, by = "year")

# format into a single pretty table
hare_table <- kable(hare_summary, align = "c", col.names = c("Year", "Count", "Average Weight (g)", "Count", "Average Weight (g)"), caption = "Table 1. Sumamry Statistics for Snowshoe Hare Monitoring Observations at Bonanza Creek Experimental Forest Between 1998 - 2012. Annual observation counts and average weight of female and male snowshoe hares are reported for each of the years during the monitoring period") %>% 
  add_header_above(c(" " = 1, "Female" = 2, "Male" = 2)) %>% 
  kable_styling(bootstrap_options = c("striped"), full_width = FALSE)

hare_table
Table 1. Sumamry Statistics for Snowshoe Hare Monitoring Observations at Bonanza Creek Experimental Forest Between 1998 - 2012. Annual observation counts and average weight of female and male snowshoe hares are reported for each of the years during the monitoring period
Female
Male
Year Count Average Weight (g) Count Average Weight (g)
1998 10 1746 5 1678
1999 113 1335 136 1241
2000 78 1517 47 1302
2001 25 1482 27 1420
2002 10 1014 17 1301
2003 33 1206 23 1223
2004 45 1536 29 1372
2005 47 1162 36 1297
2006 29 1283 15 1443
2007 57 1412 25 1190
2008 87 1331 76 1360
2009 117 1281 100 1325
2010 96 1299 43 1350
2011 85 1339 43 1400
2012 24 1291 21 1331

Data source: Kielland K., F. S. Chapin, R. W. Ruess. 2017. Snowshoe hare physical data in Bonanza Creek Experimental Forest: 1999-Present. Environmental Data Initiative. https://doi.org/10.6073/pasta/03dce4856d79b91557d8e6ce2cbcdc14.

Task 2.

This dataset, compiled and provided by @zander_venter on Kaggle, is a compilation of data aquired through Google Earth Engine (https://earthengine.google.com/). The dataset uses publically available remote sensing data, where most of the data is derived by calculating the mean for each country at a scale of about 10km. In order to understand the relationship between different environmental and climatic variables included in the dataset, a principal components analysis is provided below to demonstrate directionality of the greatest source of variance within variables considered. These variables were chosen to demonstrate the relationship of variance between different climatic variables included in the dataset.

# explore data

world_env <- read_csv("world_env_vars.csv") %>% 
  clean_names(case = "upper_camel")

summary(world_env) # check out the dataset
# select variables of interest for principal components analysis
env_subset <- world_env %>% 
  select(Elevation, TreeCanopyCover, Isothermality, RainMeanAnnual, TempMeanAnnual, Cloudiness) 

# explore missingness

VIM::matrixplot(env_subset, sortby = "Elevation")

# can't see variable names so not that helpful for exploration

naniar::gg_miss_var(env_subset)

# only a few NAs for each variable, will remove them

env_nona <- env_subset %>% 
  drop_na() # remove NA values


env_pca <- prcomp(env_nona, scale = TRUE) # run PCA

plot(env_pca)

biplot(env_pca) # lookin real UGLY! 

# a prettier version...
my_pca_plot <- autoplot(env_pca,
                        colour = NA,
                        loadings.colour = "black",
                        loadings.label = TRUE,
                        loadings.label.size = 3,
                        loadings.label.colour = "black",
                        loadings.label.repel = TRUE)+
  theme_bw()

my_pca_plot

Figure 2. Principal components analysis showing the relationship between variables included in a compilation
of Google Earth Engine datasets.
Data source: @zander_venter.